Abstract

Our  model  of  action  selection  and  postcompletion  error 
in  two  form  filling  tasks  extends  to  skip  errors  in  a  story 
telling  task.  We  also  discuss  how  it  explains 
perseverations  in  one  of  the  aforementioned  form  filling 
tasks.  Finally  we  discuss  a  predictive  classifier 
application  we  built  from  the  model’s  data.  The 
classifier  could  allow  an  autonomous  agent  to  know 
when  it  is  a  bad  time  to  interrupt  a  human,  when  a 
human  is  about  to  err,  and  how  to  help. 

Keywords:  Cognitive  Architecture;  Action  Selection;  Human 
Error;  Process  Model;  Predictive  Model 

Introduction 

We  are  probably  not  saddled  with  special  processes  that 
make  us  err.  If  neuroscience  could  find  the  locus  of  such  a 
curse,  neurosurgery  could  cure  us  of  the  reason  for  the 
expression  “to  err  is  human.”  Instead  it  is  more 
parsimonious  for  human  error  to  arise  naturally  out  of  the 
same processes we use to select our correct actions. This paper
describes how a process model of action selection that we
originally developed to explain postcompletion error in two
form filling tasks (Tamborello & Trafton, 2013a & b) also
explains skip errors in a story telling task. Other than some
task-specific details the model remains unchanged. It is part of
a  larger  effort  to  establish  a  unified  process-level  action 
selection  and  error  model.  A  unified  framework  is  important 
because  one  cognitive  system,  i.e.  the  human  mind, 
produces  all  error  types  as  well  as  correct  behaviors.  Getting 
the  explanation  correct  for  one  or  more  phenomena  in  one 
task  then  acts  as  a  constraint  on  getting  the  explanation 
correct  for  other  error  types  as  well  as  correct  action 
selection.  Furthermore,  if  we  are  to  predict  error  in  complex 
task  environments  multiple  error  types  must  fall  naturally 
out  of  the  theory. 

The  model  works  within  the  framework  of  the  ACT-R  6 
cognitive  architecture  (Anderson  et  al.,  2004).  ACT-R  is  a 
hybrid  symbolic  and  subsymbolic  computational  cognitive 
architecture  that  takes  as  inputs  knowledge  (both  procedural 
and  declarative  about  how  to  do  the  task  of  interest)  and  a 
simulated  environment  in  which  to  run.  It  posits  several 
modules, each of which performs some aspect of cognition
(e.g.,  long-term  declarative  memory,  vision).  Each  module 
has  a  buffer  into  which  it  can  place  a  symbolic 
representation  that  is  made  available  to  the  other  modules. 
ACT-R contains a variety of computational mechanisms, and
the ultimate output of the model is a time stamped series of
behaviors  including  individual  attention  shifts,  speech 
output,  button  presses,  and  the  like.  It  can  operate 
stochastically  and  so  models  may  be  non-deterministic.  One 
of  the  benefits  of  embodying  a  theory  in  a  computational 
architecture,  such  as  ACT-R,  is  that  it  allows  researchers  to 
develop  and  test  concrete,  quantitative  hypotheses  and  it 
forces  the  theorist  to  make  virtually  all  assumptions  explicit. 
To  the  extent  that  the  model  is  able  to  simulate  human-like 
performance  the  model  provides  a  sufficiency  proof  of  the 
theory. 

Our  model  also  builds  upon  the  Memory  for  Goals  theory 
(Altmann  &  Trafton,  2002),  which  posits  that  we  encode 
episodic  traces  of  our  goals  as  we  complete  tasks.  Each  goal 
is  encapsulated  in  an  episodic  memory,  which  sparsely 
represents  what  was  the  current  mental  context  at  the  time 
of its encoding. The strength of these memories decays over
time  such  that  it  may  be  difficult  to  remember  the  correct 
point  at  which  we  resume  a  task  after  an  interruption. 
Memory  for  Goals  provides  a  process-level  theory  for  why 
certain  types  of  errors  are  made  during  a  well-learned  task 
as  a  consequence  of  retrospective,  episodic  memory 
(Altmann & Trafton, 2007; Ratwani & Trafton, 2010; 2011;
Trafton,  Altmann,  &  Ratwani,  2009). 

In this report we describe how our postcompletion error
model selects its actions and how it also explains sequence
errors (perseverations and skips) in a story telling task. Some
task-specific alterations were made (noted below), but the
general principles underlying our model allow it to work in
this new domain without modification to its fundamentals.
One can think of the model as providing a highly specified
template one can use to model action selection and error in
any routine procedural task. It is a domain-specific
mini-architecture built within the more general-purpose
architecture of ACT-R.

General  Principles  of  the  Model 

We  learned  the  general  principles  enumerated  below  by 
using  one  model  to  explain  one  error  type  in  two  data  sets, 
each  having  used  a  different  experimental  paradigm  (Byrne 
&  Bovair,  1997;  Altmann,  Trafton,  &  Ratwani,  2011).  Our 
claim  is  that  it  is  by  the  dynamic  interaction  of  these 
principles  that  people  select  their  actions  in  routine 
procedural  environments.  Perseverations  and  skips 
(postcompletion  error  is  really  just  a  skip  that  happens  to 
occur  in  certain  conditions)  fall  naturally  out  of  the  action 
selection  process.  This  is  because  sometimes  activations  of 
correct  and  incorrect  action  representations  become 
comparable in strength and transient noise (a property of
ACT-R)  in  that  moment  increases  the  wrong  action’s 
activation  beyond  the  correct  action’s  activation. 

Spreading  Activation 

During  normal  operation  the  model  selects  the  next  step  it 
will  perform  by  retrieving  a  step  from  long-term  memory. 
That  retrieval  process  is  driven  by  activation  spreading  from 
a  task  context  representation  it  holds  within  its  working 
memory.  Each  step  representation  is  associated  with  all 
subsequent  steps’  representations  by  dividing  the  maximum 
allowable  association  (an  ACT-R  parameter)  by  the 
prospective  distance  from  the  current  context  to  the 
subsequent step. For example, the association from the current
context  to  the  next  step  is  equal  to  the  maximum 
association,  from  current  to  next  +1  is  1/2  the  maximum 
association  (since  the  prospective  distance  is  2),  etc.  This  is 
meant  to  implement  a  kind  of  step  co-occurrence  association 
proportional  to  how  closely  two  steps  occur  with  each  other. 
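
The association rule described above can be sketched in a few lines. This is a minimal illustration rather than the model's actual ACT-R code; the function name and the default maximum association of 1.0 are assumptions made for the example.

```python
def association_strengths(current_index, n_steps, max_association=1.0):
    """Return {step_index: strength} for every step after current_index.

    Strength is the maximum association divided by the prospective
    distance from the current context to that step.
    """
    strengths = {}
    for step in range(current_index + 1, n_steps):
        distance = step - current_index  # 1 for the next step, 2 for next + 1, ...
        strengths[step] = max_association / distance
    return strengths


# Context is at step 0 of a hypothetical five-step procedure.
strengths = association_strengths(current_index=0, n_steps=5)
```

Under this sketch the next step receives the full maximum association and the step after it receives half, so the correct step is favored but later steps still receive some priming.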

Strengthening 

As  in  Memory  for  Goals,  as  the  model  performs  each  action 
it  encodes  an  episodic  memory  of  its  current  task  context. 
That  episodic  memory’s  activation  is  strengthened  at 
creation  so  that  it  is  significantly  higher  than  other,  previous 
episodic  memories  which  have  already  decayed  to  lower 
levels  of  activation.  As  in  ACT-R's  account  of  declarative 
long-term  memory,  the  memory  matching  the  retrieval 
request  (e.g.,  for  an  episode)  and  with  the  highest  activation 
at  request  time  is  the  memory  retrieved.  Furthermore, 
retrieval  of  a  memory  strengthens  its  activation. 

Functional  Decay 

As  in  ACT-R  and  Memory  for  Goals,  the  activation  of  a 
memory  decays  over  time  if  it  is  not  strengthened  again  by 
retrieval.  Besides  its  implication  in  forgetfulness,  decay 
serves  a  function  (Altmann,  2002).  When  old  memories 
decay,  they  allow  new  ones  to  start  with  activations  which 
make  them  relatively  more  likely  to  be  retrieved  than  the  old 
ones.  This  prevents  positive  feedback  loops  when  trying  to 
remember  different  instances  of  the  same  kinds  of 
information,  such  as  episodic  encodings. 
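
To make strengthening and decay concrete, the sketch below uses the standard ACT-R base-level learning equation, in which a memory's activation is the log of the summed, power-law-decayed contributions of each time it was created or retrieved. The decay value of 0.5 and the timings are assumptions chosen for illustration, not the fitted values of our model.

```python
import math


def base_level_activation(reference_times, now, decay=0.5):
    """ACT-R base-level learning: ln(sum of (now - t) ** -decay).

    reference_times holds the moments (in seconds) at which the memory
    was created or retrieved; each one adds a contribution that decays
    as a power function of its age.
    """
    return math.log(sum((now - t) ** -decay for t in reference_times))


now = 11.0
old_episode = base_level_activation([0.0], now)        # created 11 s ago
new_episode = base_level_activation([10.0], now)       # created 1 s ago
rehearsed = base_level_activation([0.0, 10.5], now)    # old episode, retrieved 0.5 s ago
```

In this sketch the newer episode is more active than the older one purely because it has had less time to decay, while a single recent retrieval is enough to lift the older episode again.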

Interruption  and  Resumption 

The  context  we  focus  on  is  resuming  after  being  interrupted. 
With  the  rapid  rise  of  communication  technologies  that  keep 
people  accessible  at  all  times,  issues  of  interruptions  and 
multitasking have become mainstream concerns. For
example, Time magazine (Wallis, 2006) and the New York
Times  (Thompson,  2005)  both  reported  stories  about 
interruptions  and  multitasking  and  how  they  affect 
performance.  The  information  technology  research  firm 
Basex  issued  a  report  on  the  economic  impact  of 
interruptions,  which  they  estimated  to  be  around  $588 
billion  a  year  (Spira,  2005).  Given  the  prevalence  of 
interruptions,  building  systems  that  can  help  remind  an 
individual  what  they  were  doing  or  where  they  were  in  a 
task  can  have  a  large  impact  on  individual  and  group 
productivity. 


Being  interrupted  also  greatly  increases  the  number  of 
errors (Trafton, Altmann, & Ratwani, 2011). People will
frequently  repeat  a  step  that  has  already  been  performed  or 
skip  a  step  that  needs  to  be  performed  after  an  interruption. 
Sometimes  these  errors  are  irritating  (e.g.,  destroying  a  meal 
by  leaving  out  a  crucial  ingredient),  but  sometimes  they  can 
have  disastrous  consequences  (e.g.,  taking  medicine  twice 
or  not  configuring  the  flaps  for  airplane  takeoff).  The 
research described here is applicable to these domains, but
this  report  will  focus  on  a  common,  everyday  task:  being 
interrupted  while  telling  someone  a  story  or  giving 
instructions.  This  information-passing  task  is  an  excellent 
domain  for  studying  the  interruption/resumption  process  for 
several  reasons.  First,  because  it  is  so  common  to  get 
interrupted  while  talking  to  a  friend,  it  is  easy  to  collect 
data. Second, providing ordered information to another
person is a general class of problems that includes recipes,
checklists, story telling, direction giving, etc.

For  example,  in  the  middle  of  giving  you  instructions  on 
how  to  operate  a  new  device,  your  friend  needs  to  take  an 
important  phone  call  for  a  few  minutes.  When  she  comes 
back  to  tell  you  the  rest  of  the  instructions,  what  does  she 
do?  If  she  cannot  remember  exactly  where  she  left  off,  you 
may  remind  her  or  she  may  resume  where  she  thought  she 
left  off  (which  may  or  may  not  be  correct).  If  your  friend 
was  telling  you  a  story,  she  may  simply  start  somewhere 
close  to  where  she  left  off.  For  the  remainder  of  the  paper, 
we  will  focus  on  building  a  process  model  of  exactly  what 
the  interlocutor  is  doing  as  she  attempts  to  resume  the 
conversation,  then  we  will  relate  elements  of  this  model  to 
our  other  attempts  to  build  a  unified  process  model  of 
human  routine  procedural  action  selection  and  error. 

The  Story  Telling  Task 

The important point of Trafton, Jacobs, and Harrison’s (2012)
experiment for this study’s purposes was what it
demonstrated  about  how  people  resume  a  task  after  being 
interrupted. Participants read three total pages of a
soap-opera-like story, then retold the story to a confederate. After
retelling  approximately  two-thirds  of  the  story,  participants 
in  the  interruption  condition  were  interrupted  by  the 
experimenter  at  a  predetermined  location.  The  control 
condition  served  to  verify  that  the  location  of  the 
interruption  was  not  an  especially  difficult  part  of  the  story. 

Resumption  lag  was  coded  as  the  time  from  the  end  of  the 
interruption  (or  the  intended  point  of  interruption  in  the 
control  conditions)  until  the  participants  began  to  fluently 
resume  the  story.  To  code  the  location  of  the  resumption, 
Trafton,  Jacobs  and  Harrison  coded  the  gist  of  the  story 
around  the  interruption  location,  and  marked  it  as  either 
“repeat”  (e.g.,  a  gist  utterance  that  was  a  repeat  of  what  had 
already  been  said),  “correct”  (e.g.,  the  next  gist  to  occur  in 
the  story),  or  “skip”  (e.g.,  an  utterance  that  skipped  the 
correct  resumption  gist).  Experiment  results  will  be 
presented  in  conjunction  with  model  results. 


Model  Operation 
Normal  Task  Execution 

The  model  began  each  gist-telling  cycle  by  retrieving  from 
declarative  memory  a  representation  of  one  of  the  story’s 
gists.  This  retrieval  process  was  driven  primarily  by 
associative  priming  (empirically  fit  to  a  maximum  of  1) 
from  the  model’s  active  buffer  contents.  Then  it  updated  its 
active  buffer  contents  by  copying  the  contents  from  its 
retrieved  memory.  Then  the  model  spoke  the  contents  of  the 
gist.  Next  the  model  encoded  an  episodic  memory  with  a 
reference  to  its  active  buffer  contents,  i.e.,  the  gist  it  just 
spoke.  Episodic  encoding  complete,  the  model  repeated  the 
cycle,  beginning  with  the  retrieval  phase.  In  contrast  to  the 
Phaser and Financial Management tasks we modeled
previously,  we  assume  that  the  story  telling  task  has  a  flat, 
rather  than  hierarchical  goal  structure,  and  the  model’s 
declarative  memory  encoding  of  the  task  reflects  this. 
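
The gist-telling cycle can be summarized with a noise-free sketch. It is a simplification for illustration only: the function and variable names are hypothetical, transient noise and base-level activation are omitted, and the association rule assumes a maximum association of 1 divided by prospective distance.

```python
def tell_story(gists, max_association=1.0):
    """Noise-free sketch of the gist-telling cycle; returns (spoken, episodes)."""
    spoken = []      # gists in the order they were told
    episodes = []    # episodic trace: index of each gist just told
    context = -1     # active buffer contents; -1 means the story has not started
    while context + 1 < len(gists):
        # 1. Retrieve the gist most strongly primed by the current context.
        candidates = range(context + 1, len(gists))
        retrieved = max(candidates, key=lambda i: max_association / (i - context))
        # 2. Copy the retrieved gist into the active buffer.
        context = retrieved
        # 3. Speak the gist.
        spoken.append(gists[context])
        # 4. Encode an episode referring to the gist just spoken.
        episodes.append(context)
    return spoken, episodes


story = ["gist A", "gist B", "gist C"]
spoken, episodes = tell_story(story)
```

Without noise the sketch always tells the gists in order, which is why sequence errors in the model require the transient noise discussed in later sections.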

Interruption 

When  the  interruption  began  the  model  finished  encoding  its 
episodic  representation  as  per  normal  operation.  Then  it 
cleared  its  active  buffer  contents  and  simply  waited  for  230 
seconds,  the  average  interruption  duration  (Trafton,  Jacobs, 
&  Harrison,  2012).  The  episodic  memory’s  activation 
decayed  during  this  interval  using  ACT-R's  standard  decay 
mechanism  and  default  decay  rate. 

Resumption 

When  the  interruption  ended  the  model  first  tried  to  retrieve 
the  episodic  memory.  If  the  episode’s  activation  fell  below  a 
retrieval  threshold  (a  feature  of  the  ACT-R  architecture, 
empirically  fit  to  -2.375  for  this  model)  it  became 
unavailable  to  memory,  and  the  model  simulated  asking  for 
help.  When  the  model  did  successfully  retrieve  the  episode 
it  then  chose  one  of  two  competing  resumption  strategies  it 
could employ: resume with the last gist told or resume with
the next gist.
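
The effect of the retrieval threshold can be illustrated with ACT-R's standard retrieval probability equation, which gives the chance that a memory's noisy activation exceeds the threshold. The threshold of -2.375 and the noise value of 0.25 are the parameter values reported for this model; the function name and the example activations are assumptions for illustration.

```python
import math


def retrieval_probability(activation, threshold=-2.375, noise_s=0.25):
    """Probability that activation plus logistic noise exceeds the threshold."""
    return 1.0 / (1.0 + math.exp((threshold - activation) / noise_s))


at_threshold = retrieval_probability(-2.375)   # episode sitting exactly at threshold
above = retrieval_probability(-2.0)            # hypothetical less-decayed episode
below = retrieval_probability(-2.75)           # hypothetical more-decayed episode
```

An episode whose activation has decayed to exactly the threshold is retrieved half the time in this sketch, and the probability falls off quickly as decay pushes activation lower.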

We  did  this  partly  because  we  found  that  the  model’s 
associative  action  selection  mechanism — a  constraint 
provided  by  the  model’s  performance  of  the  form  filling 
tasks  described  above — would  not  allow  for  sufficiently 
high  levels  of  skips  and  repeats.  In  the  form  filling  tasks  we 
previously  modeled  repeats  and  skips  occurred  at  rates  well 
below 10%. We also assume that where people resume is subject
to a sort of social mnemonic strategy that speakers typically
employ to remind the listener of a narrative’s context, namely
repeating the last gist told. In fact, empirical testing led us
to bias the model slightly in favor of this resume-with-last
strategy. ACT-R’s utility parameter for this strategy was
0.175 units higher than the resume-with-next strategy’s
while the standard deviation of the utility transient noise
function was 0.1. This contrasts with Trafton et al.’s model
which only tried to resume-with-next.
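
As a rough guide to how strong this bias is, the sketch below computes how often the higher-utility strategy would win a noisy comparison. It assumes independent Gaussian noise on each strategy's utility, which is a simplification (ACT-R's utility noise is logistic), and the function name is hypothetical.

```python
import math


def prob_prefer_first(utility_advantage, noise_sd):
    """P(U1 + e1 > U2 + e2) for independent Gaussian noise with sd noise_sd."""
    # The difference of two independent noise terms has sd noise_sd * sqrt(2).
    z = utility_advantage / (noise_sd * math.sqrt(2.0))
    # Standard normal cumulative distribution evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


p_resume_last = prob_prefer_first(utility_advantage=0.175, noise_sd=0.1)
```

Under these assumptions the resume-with-last strategy is selected on just under nine of every ten resumptions. This is only the strategy choice; where the model finally resumes also depends on the subsequent gist retrieval.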

Here  as  in  normal  task  execution  the  model  relied  upon 
priming  by  spreading  activation  to  drive  retrieval  of  a  story 
gist. However, at resumption the model does not reconstruct
its entire active buffer contents all at once, a feature hinted
at by Altmann and Trafton’s (2007) findings regarding the time
course of resumption from interruptions.


This  means  that  in  absolute  terms  less  activation  was 
propagated  to  the  gist  retrieval  process  at  the  transition  from 
resumption  to  the  normal  execution  cycle.  This  lower  ratio 
of  associative  signal  to  transient  retrieval  noise  (0.25,  the 
same value for this ACT-R parameter as in Trafton et al.’s
model)  made  skips  more  likely  than  during  normal  story 
telling  because  the  difference  in  propagated  activation 
between  the  current  +1  gist’s  memory  (the  correct  gist)  and 
the  current  +2  gist’s  (the  skip  gist)  memory  was  less  relative 
to  normal  execution  conditions.  However,  transient  noise 
was,  on  average,  the  same.  Therefore  transient  noise  had  a 
greater influence during resumption, and that made skips
more  likely  than  during  normal  execution.  Sometimes  when 
the  model  selected  the  “resume  last  gist”  strategy  it  would 
skip,  ultimately  resulting  in  a  “correct”  resumption  or  even  a 
skip.  If  it  had  selected  the  “resume  next  gist”  strategy  it  was 
even  more  likely  to  ultimately  skip. 
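
The signal-to-noise argument can be checked with a small Monte Carlo sketch. It compares only the correct gist and the skip gist, scales their associative priming by a hypothetical source activation (1.0 for normal execution, 0.5 for degraded buffer contents at resumption), and adds logistic noise with the model's value of 0.25. Base-level activations are omitted, so the resulting rates are illustrative and are not the model's fitted skip rates.

```python
import math
import random


def logistic_noise(rng, s):
    """Draw one sample of logistic noise with scale s."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return s * math.log(u / (1.0 - u))


def skip_rate(source_activation, noise_s=0.25, max_association=1.0,
              trials=20000, seed=1):
    """Fraction of trials in which the skip gist out-competes the correct gist."""
    rng = random.Random(seed)
    skips = 0
    for _ in range(trials):
        # Correct gist: prospective distance 1; skip gist: prospective distance 2.
        correct = source_activation * max_association + logistic_noise(rng, noise_s)
        skip = source_activation * max_association / 2 + logistic_noise(rng, noise_s)
        if skip > correct:
            skips += 1
    return skips / trials


normal = skip_rate(source_activation=1.0)
resumption = skip_rate(source_activation=0.5)
```

Halving the propagated activation halves the gap between the two gists while leaving the noise unchanged, so the skip gist wins more often in the resumption condition.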

This  model  is  very  different  from  Trafton,  Jacobs,  and 
Harrison’s  in  when  it  encodes  its  episodic  memories  and 
how  its  process  leads  to  skips.  That  model  actually  encodes 
its  episodes  after  retrieval  from  declarative  memory  of  the 
next  gist  to  tell  but  before  it  has  told  the  gist.  An  interruption 
may  (or  may  not)  fall  between  this  episodic  encoding  and 
when  a  person  has  the  chance  to  carry  out  an  action. 
Therefore  the  model  claims  that  in  these  cases  people  have  a 
false  memory  of  having  carried  out  an  action.  This  leads  to 
the  somewhat  odd  prediction  that  if  an  interruption  could 
somehow  be  reliably  timed  to  occur  during  action 
preparation — between  episodic  encoding  and  action 
performance — people  will  mostly  perform  skip  errors  at 
resumption  rather  than  the  correct  action.  When  the 
interruption  falls  before  the  episodic  encoding  their  model 
predicts  that  people  will  never  skip  at  resumption. 

The  current  model  predicts  skip  rates  to  remain 
unchanged  in  such  scenarios  because  they  fall  naturally  out 
of  the  action  selection  mechanism.  Preliminary  results  from 
another  task  in  which  subjects  skip  at  a  very  small  rate  (1%) 
even  when  not  interrupted  match  the  model’s  rate  for  such 
trials.  The  increased  skip  rate  at  resumption  comes  from  the 
juxtaposition  of  the  action  selection  mechanism  with  this 
reduced  active  buffer  content  condition  that  occurs  at 
resumption,  a  feature  supported  by  Altmann  and  Trafton's 
(2007)  findings. 

Furthermore,  Trafton,  Jacobs,  and  Harrison’s  model,  upon 
interruption,  immediately  stopped  and  switched  tasks, 
predicting  no  switch  lag  for  the  interrupting  task.  The 
current  model  has  so  far  been  applied  to  tasks  that  interrupt 
immediately  upon  completion  of  an  action.  In  this  case,  the 
model  is  just  beginning  its  episodic  encoding  as  the 
interruption  starts.  The  model  predicts  switch  lag  for  the 
interruption  task  is  a  consequence  of  this  “finish  up”  activity 
left  over  from  the  previous  action. 

Model  Fit 

To  reproduce  the  empirical  data,  we  ran  2000  simulated 
trials  with  a  (virtual)  listener  available.  For  modeling 
purposes  we  focus  on  time  to  continue  after  interruption  and 
where  resumption  occurred.  Participants  asked  for  help  or 
acted  like  they  wanted  some  help  77%  of  the  time.  When 
the model attempted to retrieve an episodic code after the
interruption, it failed and asked for help 79% of the time.
When the model failed to resume on its own it was because
transient noise made the episodic memory less active than
the retrieval threshold. The threshold functioned as a cut-off
to abandon a retrieval when the amount of effort became
excessive without leading to a successful retrieval. Trafton
et al. did not match their model to control resumption
latencies nor to where participants resumed. This model
does match both in addition to the help and no help response
latencies and help frequency.

Figure 1. Resumption latencies of people (bars) and model
fit (circles) in Trafton et al.’s story telling task. Error bars
represent the 95% confidence interval.

Discussion 

The  novel  contribution  of  our  model  stems  from  its 
generalizability.  It  went  unchanged  from  postcompletion 
error  in  two  form-filling  tasks,  each  in  different 
experimental  paradigms,  to  sequence  errors  in  story  telling. 
The  two  form-filling  tasks  encouraged  participant  use  of  a 
hierarchical goal structure. The story telling task by contrast
arguably led to a flat goal structure. Yet despite the changes to
task environment and type of error under examination, the
general principles of the model (action selection by
associative spreading activation based on step co-occurrence,
strengthening, and functional decay) remained
unchanged.

Generalization  to  Other  Paradigms 

Our  model’s  account  of  sequence  errors  holds  for  computer 
tasks with hierarchical goal structures, as were used
previously to study postcompletion error. For example, Altmann,
Trafton, and Ratwani (2011) employed a type of form-filling task
called  the  Financial  Management  Task.  In  this  task 
participants  filled  out  a  form  to  buy  or  sell  financial 
securities  based  on  orders  received.  The  task  involved 
entering  information  from  the  orders  into  a  series  of 
interface elements such as pull-down menus, check boxes,
and radio buttons. Elements were arranged into clusters of
two to four elements all relating to one aspect of the order.
There were ten such clusters in the interface and the task
required participants to enter information into the clusters
following a specific order.

Figure 2. Where participants (bars) and the model (circles)
resumed.

Participants  were  interrupted  occasionally  while  they 
performed this task. When they resumed, the interface
provided no cues to aid resumption at the correct next
cluster. Participants sometimes perseverated the last step
they had performed or skipped the next step they were
supposed to perform.

Perseveration Although the most recently-created episode,
the one created for the action just performed, had the highest
base level activation, at this point the previous episode,
although decayed, was still more active than the background
activation level. Transient noise occasionally caused
previous episodic memories (usually the next most-recent)
to be more active at that moment. Perseveration errors
occurred here, when the model would retrieve an episode
from one or two steps ago because of this combination of
transient noise and the relative recency of the episode’s
creation, which had not allowed decay enough time to reduce
its activation very far.

Unlike the story telling task’s interruption, which
involved declarative memory encoding of additional story,
the financial management task’s interruption task was to
solve simple arithmetic problems for 15 seconds. We
assume that some declarative rehearsal is possible during
this time, perhaps interleaved with arithmetic operations as
in Salvucci and Taatgen’s theory of cognitive threading
(Salvucci & Taatgen, 2008). A rehearsal cycle early during
the interruption could retrieve one of these slightly older
episodes as described above, strengthening that memory’s
activation. Subsequent rehearsal cycles tended to strengthen
whatever episode the model had retrieved at the onset of the
interruption. At resumption the model would then load an
older, rather than the current, context into working memory
and perseverate an older step.

Skips  The  model  would  perform  skips  in  the  financial 
management  task  for  exactly  the  same  reasons  as  in  the 
story  telling  task.  Associative  priming  from  active  buffer 
contents  “bleeds  over”  from  the  intended  target  of 
declarative  retrieval,  the  correct  next  step,  to  the  step  after 
that. This effect is magnified under conditions of degraded
active buffer contents, such as occur during
post-interruption resumption, as in the financial management task,
or  in  high  memory  load  as  in  Byrne  and  Bovair’s  Phaser 
task. 

Potential  Issues  and  Future  Work 

Although  the  story  telling  model  presented  here  did  not 
actually  perform  the  interruption  task,  we  believe  the 
pertinent  aspect  of  the  interruption  was  simply  that  people 
did not engage in the primary task for some portion of time,
as a well-established and general mechanism, decay,
explained this aspect of interruption and resumption
performance. Future iterations of the model may incorporate
developments  such  as  Salvucci  and  Taatgen's  Threaded 
Cognition  model  to  address  issues  such  as  what  happens 
when  interrupting  tasks  are  complex  and  demanding, 
particularly  of  declarative  memory. 

We  hope  to  apply  the  general  principles  learned  from  this 
model’s  development  to  difficult  human-computer 
interaction  problems.  For  example,  our  process  model,  for 
each  potentially-retrieved  memory  for  each  declarative 
memory  retrieval  operation,  produces  a  retrieval  probability 
distribution.  We  can  use  this  theoretically-derived  data  to 
build  an  action  model  for  application  in  an  autonomous 
teammate  for  a  human.  During  task  execution  the  robot 
builds  a  model  of  the  human’s  cognitive  state  based  on 
observed  actions  and  known  procedures.  When  the  set  of 
retrieval  probability  distributions  indicates  that  the  human’s 
memory  encoding  the  correct  next  task  step  is  not  clearly 
the  most  likely  to  be  retrieved  by  the  human  then  the  robot 
intercedes  to  remind  its  human  teammate  of  the  correct  task 
context. Human-robot interaction benefits because the robot
uses  the  process  model  to  “know”  its  human  teammate  like 
a  human  teammate  would.  This  gives  the  robot  the 
capability  to  know  when  help  is  needed,  how  to  help,  and 
how  to  otherwise  remain  unobtrusive.  In  other  words,  the 
robot  would  know  when  it  would  be  a  bad  time  to  interrupt 
a  person. 

Example  Application:  An  Autonomous  Agent 
Sensitive  to  Interruption  Costs 

A  human  and  a  robotic  teammate  set  out  on  a  task.  Both 
know  the  task.  The  robot  observes  its  human  teammate  as 
they  perform  the  task.  The  robot  has  a  cognitive  model  of 
the  task  running  and  follows  along,  updating  the  model’s 
state  as  the  team  performs  the  task.  As  in  the  model 
presented  earlier  in  this  paper,  each  step  will  involve  a 
declarative  retrieval  of  the  relevant  step  memory  and  that 
means  the  robot’s  model  contains  the  data  necessary  to 
predict the human’s performance. We took that data and
developed a classifier to predict whether or not a skip was
imminent. 

The  variables  that  went  into  the  logistic  regression  were 
the  activations  of  the  correct  and  next  two  gist  memories  as 
well  as  the  amount  of  activation  spread  by  the  active  buffer 
contents.  As  expected,  the  activation  of  the  correct  next  gist 
memory  and  activation  spreading  from  active  buffer 
contents  were  highly  negatively  predictive  of  skip  outcome 
while  activations  of  the  next  two  memories  were  highly 
positively  predictive  of  skip  outcome. 
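
The form of such a classifier can be sketched as follows. The feature names, coefficient values, and intercept are hypothetical placeholders; only the signs of the coefficients follow the pattern described above, and none of the numbers come from the fitted regression.

```python
import math

# Hypothetical coefficients: only their signs reflect the reported pattern.
WEIGHTS = {
    "correct_activation": -2.0,  # activation of the correct next gist
    "next1_activation": 1.5,     # activation of the gist after the correct one
    "next2_activation": 1.5,     # activation of the gist after that
    "spread": -1.0,              # activation spread by active buffer contents
}
INTERCEPT = -1.0


def skip_probability(features, weights=WEIGHTS, intercept=INTERCEPT):
    """Logistic regression prediction: probability that the next action is a skip."""
    score = intercept + sum(weights[name] * features[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-score))


safe = skip_probability({"correct_activation": 1.5, "next1_activation": 0.2,
                         "next2_activation": 0.0, "spread": 1.0})
risky = skip_probability({"correct_activation": 0.4, "next1_activation": 0.6,
                          "next2_activation": 0.3, "spread": 0.3})
```

With coefficients of these signs, a weakly activated correct gist combined with relatively active later gists yields a higher predicted skip probability, which is the situation in which an autonomous teammate would consider intervening.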

We  then  evaluated  classifications  of  the  logistic  model 
using  receiver-operating  characteristic  analysis.  We 
determined the optimal decision criterion to be a skip
probability of 33.5%, which resulted in a true positive rate of
79.7% with a false positive rate of 0.9%, a d’ of 3.17, and an
area under the ROC curve of 0.986. The area under the
curve  represents  the  probability  that  the  logistic  regression 
model  will  rank  a  randomly  chosen  positive  instance  (i.e.  an 
error)  higher  than  a  randomly  chosen  negative  instance  (i.e. 
non-error)  (Fawcett,  2006;  Macmillan  &  Creelman,  2005). 
This is quite a high degree of discriminability and it means
that when the decision model predicts a probability of skip
of 33.5% or greater, the robot would successfully intervene
in 79.7% of cases in which the human would have
committed a skip error while wrongly interrupting the
human in only 0.9% of cases in which no skip would have occurred.
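
The two discriminability measures can be computed as sketched below. The d’ function uses the usual equal-variance signal detection definition, and the area under the curve is computed directly from its ranking interpretation. Function names and the toy scores are assumptions for illustration; the rates passed to d_prime are the rounded values reported above, so the result should land near, but not necessarily exactly on, the reported 3.17.

```python
from statistics import NormalDist


def d_prime(true_positive_rate, false_positive_rate):
    """Equal-variance signal detection sensitivity: z(TPR) - z(FPR)."""
    z = NormalDist().inv_cdf
    return z(true_positive_rate) - z(false_positive_rate)


def auc(positive_scores, negative_scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


sensitivity = d_prime(0.797, 0.009)
toy_auc = auc([0.9, 0.6, 0.4], [0.5, 0.2, 0.1])  # made-up classifier scores
```

The pairwise definition is quadratic in the number of instances, which is acceptable for a sketch; rank-based formulas give the same value more efficiently for large data sets.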

Conclusions 

We  started  this  project  with  our  extant  model  of 
postcompletion  error  in  two  experimental  paradigms,  the 
Byrne  and  Bovair  phaser  task  and  the  financial  management 
task.  This  model  contained  a  theory  about  how  people  select 
their  actions  when  engaged  in  a  routine  procedural  task 
because,  we  hypothesized,  the  same  mechanisms  underlying 
correct  behavior  must  also  cause  errors.  We  tested  the 
generalizability  of  our  account  by  applying  it  to  a  new  task 
domain  and  new  error  type,  the  story  telling  task  and  skip 
errors,  respectively.  The  model's  process  worked  very  well 
for the story telling task. Furthermore, its mechanisms
rooted in Memory for Goals (Altmann & Trafton, 2002) also
provide an explanation for another type of error,
perseveration. 

We  then  applied  data  generated  by  the  model  to  a  binary 
classification  problem:  for  a  given  point  within  a  task,  is  a 
human  likely  to  commit  a  skip  error?  We  demonstrated  that 
we  can  correctly  predict  the  occurrence  of  skip  errors  in  the 
model  data  with  high  accuracy.  Furthermore,  because  our 
cognitive  model  performs  so  well  across  a  variety  of  routine 
procedural  tasks  we  strongly  suspect  that  our  classifier 
should  perform  well  across  that  domain  as  well. 

Future  work  may  validate  the  classifier’s  performance  for 
this  story  telling  task.  However  the  generalizability  of  the 
cognitive  model  upon  which  the  classifier  is  based  is  what 
gives  us  hope  for  it  one  day  being  truly  useful  in  the  more 
general  domain  of  routine  procedural  tasks.  This  includes 
many  tasks  in  which  one  would  want  to  use  a  robot — tasks 
that  are  dangerous,  dirty,  or  dull — and  especially  tasks 
which  demand  human  judgment  but  are  improved  with 
automated  systems,  such  as  transportation,  chemical 
processing,  and  energy  production,  to  name  a  few. 